An Automated Annotation Process for the SciDocAnnot Scientific Document Model
Abstract
Answering precise and complex queries on a corpus of scientific documents requires precise modelling of the document contents. In particular, each document element must be characterised by its discourse type (hypothesis, definition, result, method, etc.). In this paper we present a scientific document model (SciAnnotDoc) that takes these discourse types into account. We then show that an automated process can effectively analyse documents to determine the discourse type of each element. The process, based on syntactic rules (patterns), has been evaluated in terms of precision and recall on a representative corpus of more than 1000 articles in gender studies. It has been used to create a SciDocAnnot representation of the corpus, on top of which we built a faceted search interface. Experiments with users show that searching with this interface clearly outperforms standard keyword search for complex queries.
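As a rough illustration of what rule-based discourse typing and its evaluation can look like, the following is a minimal sketch only. It assumes plain regular expressions over sentence text as a stand-in for the syntactic rules the abstract mentions; the pattern lists and the names `PATTERNS`, `annotate` and `precision_recall` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical surface patterns, one list per discourse type.
# Illustrative stand-ins only; these are NOT the rules used in the paper.
PATTERNS = {
    "definition": [
        re.compile(r"\b(is|are) defined as\b", re.I),
        re.compile(r"\brefers to\b", re.I),
    ],
    "hypothesis": [
        re.compile(r"\bwe (hypothesi[sz]e|assume) that\b", re.I),
    ],
    "result": [
        re.compile(r"\b(results?|findings?) (show|indicate|suggest)s?\b", re.I),
        re.compile(r"\bwe found that\b", re.I),
    ],
    "method": [
        re.compile(r"\bwe (used|conducted|collected|interviewed)\b", re.I),
    ],
}


def annotate(sentence):
    """Return the set of discourse types whose patterns match the sentence."""
    return {
        dtype
        for dtype, patterns in PATTERNS.items()
        if any(p.search(sentence) for p in patterns)
    }


def precision_recall(predicted, gold, dtype):
    """Precision and recall for one discourse type.

    `predicted` and `gold` are parallel lists of label sets, one per sentence.
    """
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if dtype in p and dtype in g)
    fp = sum(1 for p, g in pairs if dtype in p and dtype not in g)
    fn = sum(1 for p, g in pairs if dtype not in p and dtype in g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A sentence may receive several discourse types or none, which is why `annotate` returns a set. Precision and recall are computed per discourse type against manually assigned labels, mirroring the kind of evaluation the abstract describes.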
Similar resources
Automatic Workflow Generation and Modification by Enterprise Ontologies and Documents
This article presents a novel method and development paradigm that proposes a general template for an enterprise information structure and allows for the automatic generation and modification of enterprise workflows. This dynamically integrated workflow development approach utilises a conceptual ontology of domain processes and tasks, enterprise charts, and enterprise entities. It also suggests...
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Fuzzy Neighbor Voting for Automatic Image Annotation
With the rapid development of digital images and the availability of imaging tools, massive amounts of images are created. Therefore, efficient management and suitable retrieval, especially by computers, is one of the most challenging fields in image processing. Automatic image annotation (AIA) refers to attaching words, keywords or comments to an image or to a selected part of it. In this paper,...
Role of Scientific Authority in the Development Process in Iran: A Systematic Review
Objectives: Scientific authority, which means that others continuously refer to an individual or organization and recognize it as a theory-maker, leads to the social, economic and scientific development of a society. The goal of this study was to explain the role of scientific authority in the development process of the country based on the conducted studies. Method: This study was conducted in...